Latent Planning via Embedding Arithmetic: A Contrastive Approach to Strategic Reasoning

Hamara, Andrew, Hamerly, Greg, Rivas, Pablo, Freeman, Andrew C.

arXiv.org Artificial Intelligence

Planning in high-dimensional decision spaces is increasingly being studied through the lens of learned representations. Rather than training policies or value heads, we investigate whether planning can be carried out directly in an evaluation-aligned embedding space. We introduce SOLIS, which learns such a space using supervised contrastive learning. In this representation, outcome similarity is captured by proximity, and a single global advantage vector orients the space from losing to winning regions. Candidate actions are then ranked according to their alignment with this direction, reducing planning to vector operations in latent space. We demonstrate this approach in chess, where SOLIS uses only a shallow search guided by the learned embedding to reach competitive strength under constrained conditions. More broadly, our results suggest that evaluation-aligned latent planning offers a lightweight alternative to traditional dynamics models or policy learning.
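The ranking step described above reduces to vector operations in latent space. A minimal sketch of that idea, with a toy embedding space and a hand-set advantage vector standing in for the authors' trained components:

```python
import numpy as np

def rank_actions(candidate_embeddings, advantage_vector):
    """Rank candidate actions by cosine alignment with the
    global losing-to-winning advantage direction."""
    adv = advantage_vector / np.linalg.norm(advantage_vector)
    embs = candidate_embeddings / np.linalg.norm(
        candidate_embeddings, axis=1, keepdims=True)
    scores = embs @ adv          # cosine similarity with the advantage direction
    return np.argsort(-scores)   # best-aligned action first

# Toy example: three candidate moves in a 4-d embedding space.
advantage = np.array([1.0, 0.0, 0.0, 0.0])
candidates = np.array([
    [0.2, 0.9, 0.0, 0.0],    # weakly aligned
    [0.9, 0.1, 0.0, 0.0],    # strongly aligned
    [-0.5, 0.3, 0.0, 0.0],   # anti-aligned
])
order = rank_actions(candidates, advantage)
```

In the paper this ranking guides a shallow search; the sketch shows only the alignment computation itself.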


Revealing the Hidden Third Dimension of Point Defects in Two-Dimensional MXenes

Guinan, Grace, Smeaton, Michelle A., Wyatt, Brian C., Goldy, Steven, Egan, Hilary, Glaws, Andrew, Tucker, Garritt J., Anasori, Babak, Spurgeon, Steven R.

arXiv.org Artificial Intelligence

Point defects govern many important functional properties of two-dimensional (2D) materials. However, resolving the three-dimensional (3D) arrangement of these defects in multi-layer 2D materials remains a fundamental challenge, hindering rational defect engineering. Our approach reconstructs the 3D coordinates of vacancies across hundreds of thousands of lattice sites, generating robust statistical insight into their distribution that can be correlated with specific synthesis pathways. This large-scale data enables us to classify a hierarchy of defect structures, from isolated vacancies to nanopores, revealing their preferred formation and interaction mechanisms, as corroborated by molecular dynamics simulations. This work provides a generalizable framework for understanding and ultimately controlling point defects across large volumes, paving the way for the rational design of defect-engineered functional 2D materials. Keywords: 2D materials, point defects, autonomous materials science, electron microscopy, machine learning. Two-dimensional (2D) materials have become a major field of modern research in materials science after the discovery of graphene in 2004. The challenge of characterizing point defects is significantly amplified in few-layered 2D materials. For instance, MXenes, a class of 2D transition metal carbides, carbonitrides, and nitrides, consist of nanosheets containing two to five layers of metal atoms, which complicates defect analysis compared to single-layer materials.


Fine-Tuning Vision-Language Models for Multimodal Polymer Property Prediction

Vuong, An, Van, Minh-Hao, Verma, Prateek, Zhao, Chen, Wu, Xintao

arXiv.org Artificial Intelligence

Vision-Language Models (VLMs) have shown strong performance in tasks like visual question answering and multimodal text generation, but their effectiveness in scientific domains such as materials science remains limited. While some machine learning methods have addressed specific challenges in this field, there is still a lack of foundation models designed for broad tasks like polymer property prediction using multimodal data. In this work, we present a multimodal polymer dataset to fine-tune VLMs through instruction-tuning pairs and assess the impact of multimodality on prediction performance. Our fine-tuned models, using LoRA, outperform unimodal and baseline approaches, demonstrating the benefits of multimodal learning. Additionally, this approach reduces the need to train separate models for different properties, lowering deployment and maintenance costs.
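LoRA, the fine-tuning method named above, learns a low-rank additive update to frozen pretrained weights. A minimal sketch of the mechanism (the sizes and rank here are illustrative, not the paper's configuration):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2       # illustrative sizes; r << d keeps the update cheap

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                # trainable up-projection, zero-initialized

def lora_forward(x):
    # Frozen path plus low-rank adaptation: y = Wx + B(Ax)
    return W @ x + B @ (A @ x)

x = rng.normal(size=d_in)
base = W @ x
adapted = lora_forward(x)
# With B initialized to zero, the adapted model starts identical to the base model,
# and only the small A and B matrices are trained.
```

Because only A and B are updated, one base model can serve multiple properties with cheap per-task adapters, which is what lowers the deployment cost mentioned above.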


Robust DDoS-Attack Classification with 3D CNNs Against Adversarial Methods

Bragg, Landon, Dorsey, Nathan, Prior, Josh, Ajit, John, Kim, Ben, Willis, Nate, Rivas, Pablo

arXiv.org Artificial Intelligence

Distributed Denial-of-Service (DDoS) attacks remain a serious threat to online infrastructure, often bypassing detection by altering traffic in subtle ways. We present a method using hive-plot sequences of network data and a 3D convolutional neural network (3D CNN) to classify DDoS traffic with high accuracy. Our system relies on three main ideas: (1) using spatio-temporal hive-plot encodings to set a pattern-recognition baseline, (2) applying adversarial training with FGSM and PGD alongside spatial noise and image shifts, and (3) analyzing frame-wise predictions to find early signals. On a benchmark dataset, our method lifts adversarial accuracy from 50-55% to over 93% while maintaining clean-sample performance. Frames 3-4 offer strong predictive signals, showing early-stage classification is possible.
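FGSM, one of the adversarial-training attacks named above, perturbs an input along the sign of the loss gradient. A minimal sketch on a logistic model (the model and epsilon are illustrative stand-ins for the paper's 3D CNN setup):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One-step FGSM for logistic regression with binary cross-entropy:
    the input gradient of the loss is (p - y) * w."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.3, 0.1, -0.2])
x_adv = fgsm(x, y=1.0, w=w, b=b, eps=0.1)

# The perturbation raises the loss for the true label y=1,
# i.e. lowers the predicted probability for the correct class.
p_clean = sigmoid(w @ x + b)
p_adv = sigmoid(w @ x_adv + b)
```

Adversarial training then mixes such perturbed samples into the training set so the classifier stays accurate under attack.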


Modality-Aware Infrared and Visible Image Fusion with Target-Aware Supervision

Sun, Tianyao, Xiang, Dawei, Ding, Tianqi, Fang, Xiang, Qi, Yijiashun, Zhao, Zunduo

arXiv.org Artificial Intelligence

Infrared and visible image fusion (IVIF) is a fundamental task in multi-modal perception that aims to integrate complementary structural and textural cues from different spectral domains. In this paper, we propose FusionNet, a novel end-to-end fusion framework that explicitly models inter-modality interaction and enhances task-critical regions. FusionNet introduces a modality-aware attention mechanism that dynamically adjusts the contribution of infrared and visible features based on their discriminative capacity. To achieve fine-grained, interpretable fusion, we further incorporate a pixel-wise alpha blending module, which learns spatially-varying fusion weights in an adaptive and content-aware manner. Moreover, we formulate a target-aware loss that leverages weak ROI supervision to preserve semantic consistency in regions containing important objects (e.g., pedestrians, vehicles). Experiments on the public M3FD dataset demonstrate that FusionNet generates fused images with enhanced semantic preservation, high perceptual quality, and clear interpretability. Our framework provides a general and extensible solution for semantic-aware multi-modal image fusion, with benefits for downstream tasks such as object detection and scene understanding.
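The pixel-wise alpha blending module described above computes a convex combination of the two modalities at every pixel. A minimal sketch, with a hand-set weight map standing in for the learned, content-aware one:

```python
import numpy as np

def alpha_blend(ir, vis, alpha):
    """Pixel-wise fusion: each output pixel is a convex combination of the
    infrared and visible inputs, weighted by a spatially-varying alpha map."""
    assert ir.shape == vis.shape == alpha.shape
    return alpha * ir + (1.0 - alpha) * vis

ir = np.array([[1.0, 0.0], [0.5, 0.5]])
vis = np.array([[0.0, 1.0], [0.5, 0.5]])
# In FusionNet the alpha map is predicted per pixel; here it is hand-set:
# 1.0 keeps the infrared pixel, 0.0 keeps the visible one.
alpha = np.array([[1.0, 0.0], [0.5, 0.5]])
fused = alpha_blend(ir, vis, alpha)
```

Because the weights are per-pixel, the map itself is interpretable: high alpha marks regions where the infrared channel dominates the fused output.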


Near Real-Time Dust Aerosol Detection with 3D Convolutional Neural Networks on MODIS Data

Gates, Caleb, Moorhead, Patrick, Ferguson, Jayden, Darwish, Omar, Stallman, Conner, Rivas, Pablo, Quansah, Paapa

arXiv.org Artificial Intelligence

Dust storms harm health and reduce visibility; quick detection from satellites is needed. We present a near real-time system that flags dust at the pixel level using multi-band images from NASA's Terra and Aqua (MODIS). A 3D convolutional network learns patterns across all 36 bands, plus split thermal bands, to separate dust from clouds and surface features. Simple normalization and local filling handle missing data. An improved version raises training speed by 21x and supports fast processing of full scenes. On 17 independent MODIS scenes, the model reaches about 0.92 accuracy with a mean squared error of 0.014. Maps show strong agreement in plume cores, with most misses along edges. These results show that joint band-and-space learning can provide timely dust alerts at global scale; using wider input windows or attention-based models may further sharpen edges.
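The "simple normalization and local filling" mentioned above might look like the following sketch, assuming per-band min-max scaling and a neighborhood-mean fill; the abstract does not specify the exact preprocessing, so these choices are assumptions:

```python
import numpy as np

def normalize_band(band):
    """Min-max scale one band to [0, 1], ignoring missing pixels (NaN)."""
    lo, hi = np.nanmin(band), np.nanmax(band)
    return (band - lo) / (hi - lo)

def fill_local(band):
    """Replace each NaN with the mean of its valid 3x3 neighborhood."""
    out = band.copy()
    for i, j in zip(*np.where(np.isnan(band))):
        window = band[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
        out[i, j] = np.nanmean(window)
    return out

band = np.array([[0.0, 2.0], [4.0, np.nan]])
filled = fill_local(normalize_band(band))
```

Applied per band, this keeps all 36 MODIS channels on a common scale and free of gaps before they reach the 3D network.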


A Data-Driven Approach to Enhancing Gravity Models for Trip Demand Prediction

Acharya, Kamal, Lad, Mehul, Sun, Liang, Song, Houbing

arXiv.org Artificial Intelligence

Accurate prediction of trips between zones is critical for transportation planning, as it supports resource allocation and infrastructure development across various modes of transport. Although the gravity model has been widely used due to its simplicity, it often inadequately represents the complex factors influencing modern travel behavior. This study introduces a data-driven approach to enhance the gravity model by integrating geographical, economic, social, and travel data from the counties in Tennessee and New York state. Using machine learning techniques, we extend the capabilities of the traditional model to handle more complex interactions between variables. Our experiments demonstrate that machine learning-enhanced models significantly outperform the traditional model. Our results show a 51.48% improvement in R-squared, indicating a substantial enhancement in the model's explanatory power. Also, a 63.59% reduction in Mean Absolute Error (MAE) reflects a significant increase in prediction accuracy. Furthermore, a 44.32% increase in Common Part of Commuters (CPC) demonstrates improved prediction reliability. These findings highlight the substantial benefits of integrating diverse datasets and advanced algorithms into transportation models. They provide urban planners and policymakers with more reliable forecasting and decision-making tools.
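The classical gravity model that this study extends predicts trip demand as proportional to the product of zone sizes and inversely proportional to a power of distance. A minimal sketch (the calibration constant and distance exponent are illustrative):

```python
def gravity_trips(pop_i, pop_j, dist_ij, k=1.0, beta=2.0):
    """Classical gravity model: T_ij = k * P_i * P_j / d_ij**beta.
    The data-driven approach replaces this fixed functional form with a
    learned function of geographic, economic, and social features."""
    return k * pop_i * pop_j / dist_ij ** beta

# Toy example: two county pairs with equal populations but different distances.
near = gravity_trips(100_000, 50_000, dist_ij=10.0)
far = gravity_trips(100_000, 50_000, dist_ij=20.0)
# With beta=2, doubling the distance quarters the predicted demand.
```

The rigidity of this form, a single distance-decay exponent for all zone pairs, is exactly what the machine learning models relax.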


Hyperproperty-Constrained Secure Reinforcement Learning

Bonnah, Ernest, Nguyen, Luan Viet, Hoque, Khaza Anuarul

arXiv.org Artificial Intelligence

Hyperproperties for Time Window Temporal Logic (HyperTWTL) is a domain-specific formal specification language known for its effectiveness in compactly representing security, opacity, and concurrency properties for robotics applications. This paper focuses on HyperTWTL-constrained secure reinforcement learning (SecRL). Although temporal logic-constrained safe reinforcement learning (SRL) is an evolving research problem with a growing body of literature, there is a significant research gap in exploring security-aware reinforcement learning (RL) using hyperproperties. Given the dynamics of an agent as a Markov Decision Process (MDP) and opacity/security constraints formalized as HyperTWTL, we propose an approach for learning security-aware optimal policies using dynamic Boltzmann softmax RL while satisfying the HyperTWTL constraints. The effectiveness and scalability of our proposed approach are demonstrated using a pick-up and delivery robotic mission case study. We also compare our results with two other baseline RL algorithms, showing that our proposed method outperforms them.
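Boltzmann softmax action selection, the exploration rule underlying the dynamic Boltzmann softmax RL above, weights actions exponentially by their Q-values. A minimal sketch (the temperature value is illustrative):

```python
import numpy as np

def boltzmann_policy(q_values, temperature):
    """Softmax over Q-values: higher-value actions get exponentially more
    probability mass; low temperature approaches greedy selection."""
    z = np.asarray(q_values) / temperature
    z -= z.max()                 # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

q = [1.0, 2.0, 0.5]
probs = boltzmann_policy(q, temperature=0.5)
```

The "dynamic" variant in the paper adjusts the temperature over training; here it is fixed for illustration.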


A Survey of AI for Materials Science: Foundation Models, LLM Agents, Datasets, and Tools

Van, Minh-Hao, Verma, Prateek, Zhao, Chen, Wu, Xintao

arXiv.org Artificial Intelligence

Foundation models (FMs) are catalyzing a transformative shift in materials science (MatSci) by enabling scalable, general-purpose, and multimodal AI systems for scientific discovery. Unlike traditional machine learning models, which are typically narrow in scope and require task-specific engineering, FMs offer cross-domain generalization and exhibit emergent capabilities. Their versatility is especially well-suited to materials science, where research challenges span diverse data types and scales. This survey provides a comprehensive overview of foundation models, agentic systems, datasets, and computational tools supporting this growing field. We introduce a task-driven taxonomy encompassing six broad application areas: data extraction, interpretation and Q&A; atomistic simulation; property prediction; materials structure, design and discovery; process planning, discovery, and optimization; and multiscale modeling. We discuss recent advances in both unimodal and multimodal FMs, as well as emerging large language model (LLM) agents. Furthermore, we review standardized datasets, open-source tools, and autonomous experimental platforms that collectively fuel the development and integration of FMs into research workflows. We assess the early successes of foundation models and identify persistent limitations, including challenges in generalizability, interpretability, data imbalance, safety concerns, and limited multimodal fusion. Finally, we articulate future research directions centered on scalable pretraining, continual learning, data governance, and trustworthiness.


Gradient Flow Matching for Learning Update Dynamics in Neural Network Training

Shou, Xiao, Ding, Yanna, Gao, Jianxi

arXiv.org Machine Learning

Training deep neural networks remains computationally intensive due to the iterative nature of gradient-based optimization. We propose Gradient Flow Matching (GFM), a continuous-time modeling framework that treats neural network training as a dynamical system governed by learned optimizer-aware vector fields. By leveraging conditional flow matching, GFM captures the underlying update rules of optimizers such as SGD, Adam, and RMSprop, enabling smooth extrapolation of weight trajectories toward convergence. Unlike black-box sequence models, GFM incorporates structural knowledge of gradient-based updates into the learning objective, facilitating accurate forecasting of final weights from partial training sequences. Empirically, GFM achieves forecasting accuracy that is competitive with Transformer-based models and significantly outperforms LSTM and other classical baselines. Furthermore, GFM generalizes across neural architectures and initializations, providing a unified framework for studying optimization dynamics and accelerating convergence prediction.
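Conditional flow matching, the objective leveraged above, regresses a vector field onto the straight-line velocity between paired states. A minimal sketch in numpy, where two weight snapshots stand in for a training trajectory and the candidate fields are illustrative stand-ins for the learned model:

```python
import numpy as np

rng = np.random.default_rng(0)

def cfm_loss(v_field, x0, x1, n_samples=128):
    """Conditional flow matching: regress the field v(x_t, t) onto the
    constant velocity (x1 - x0) along the straight path x_t = (1-t)x0 + t*x1."""
    ts = rng.uniform(size=(n_samples, 1))
    xt = (1 - ts) * x0 + ts * x1          # points sampled along the path
    target = x1 - x0                      # velocity of the straight path
    pred = np.stack([v_field(x, t) for x, t in zip(xt, ts)])
    return np.mean(np.sum((pred - target) ** 2, axis=1))

# Pair of weight snapshots: early-training (x0) and converged (x1).
x0 = np.array([0.0, 0.0])
x1 = np.array([1.0, -1.0])

perfect = lambda x, t: x1 - x0           # the true straight-line velocity
zero = lambda x, t: np.zeros_like(x)     # a trivially wrong field

# The matching objective is minimized exactly by the true velocity field.
```

In GFM the field is a trained, optimizer-aware network; integrating it forward from a partial trajectory yields the forecast of the final weights.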